56 research outputs found

On the Impact of Entity Linking in Microblog Real-Time Filtering

Author: Berardi G.
Han Z.
Ounis I.
Robertson S.
Soboroff I.
Zhang Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/11/2016

Microblogging is a model of content sharing in which the temporal locality of posts with respect to important events, either of foreseeable or unforeseeable nature, makes applications of real-time filtering of great practical interest. We propose the use of Entity Linking (EL) in order to improve the retrieval effectiveness, by enriching the representation of microblog posts and filtering queries. EL is the process of recognizing in an unstructured text the mention of relevant entities described in a knowledge base. EL of short pieces of text is a difficult task, but it is also a scenario in which the information EL adds to the text can have a substantial impact on the retrieval process. We implement a state-of-the-art filtering method, based on the best systems from the TREC Microblog track realtime adhoc retrieval and filtering tasks, and extend it with a Wikipedia-based EL method. Results show that the use of EL significantly improves over non-EL based versions of the filtering methods. Comment: 6 pages, 1 figure, 1 table. SAC 2015, Salamanca, Spain - April 13 - 17, 201

arXiv.org e-Print Archive

Crossref

The Feasibility of Brute Force Scans for Real-Time Tweet Search

Author: Boncz P.
Ounis I.
Soboroff I.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/12/2015

The real-time search problem requires making ingested documents immediately searchable, which presents architectural challenges for systems built around inverted indexing. In this paper, we explore a radical proposition: What if we abandon document inversion and instead adopt an architecture based on brute force scans of document representations? In such a design, “indexing” simply involves appending the parsed representation of an ingested document to an existing buffer, which is simple and fast. Quite surprisingly, experiments with TREC Microblog test collections show that query evaluation with brute force scans is feasible and performance compares favorably to a traditional search architecture based on an inverted index, especially if we take advantage of vectorized SIMD instructions and multiple cores in modern processor architectures. We believe that such a novel design is worth further exploration by IR researchers and practitioners.
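
The append-then-scan design described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ScanIndex`, the whitespace tokenizer, and the summed term-frequency score are all assumptions made for illustration, and the sketch uses neither SIMD instructions nor multiple cores.

```python
from collections import Counter


class ScanIndex:
    """Hypothetical brute-force 'index': an append-only buffer of parsed documents."""

    def __init__(self):
        # Each entry is (doc_id, term -> count); there is no inverted index.
        self.buffer = []

    def ingest(self, doc_id, text):
        # "Indexing" is a single append of the parsed representation,
        # so the document is searchable immediately.
        self.buffer.append((doc_id, Counter(text.lower().split())))

    def search(self, query, k=10):
        # Query evaluation scans every document in the buffer.
        terms = query.lower().split()
        scored = []
        for doc_id, counts in self.buffer:
            # Assumed scoring: sum of query-term frequencies.
            score = sum(counts[t] for t in terms)
            if score > 0:
                scored.append((score, doc_id))
        # Highest score first; ties broken by doc_id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:k]]


index = ScanIndex()
index.ingest("t1", "earthquake hits city")
index.ingest("t2", "city council meets")
index.ingest("t3", "earthquake earthquake aftershock")
print(index.search("earthquake"))
```

Ingestion cost here is one tokenization plus one list append, while every query pays a full pass over the buffer; the abstract reports that this trade-off can still compare favorably with an inverted index.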

CiteSeerX

Crossref

Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

Author: AM Cohen
D Demner-Fushman
E Amitay
EM Voorhees
Fabien Campagne
I Soboroff
JA Aslam
K Sparck Jones
K Sparck Jones
KC Dorff
M Fuller
P Boldi
P Dong
R Nuray
S Buttcher
SE Robertson
SF Kim
Y Yue
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications.

Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79–0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently.

Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Random Walk Model for Item Recommendation in Social Tagging Systems

Author: Basu C.
Bogers T.
Breese J. S.
Claypool M.
Good N.
Gori M.
Peng J.
Popescul A.
Salakhutdinov R.
Si L.
Soboroff I.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Cervical lymph node metastasis in adenoid cystic carcinoma of the larynx: a collective international review

Author: A Coca-Pelaz
A Ferlito
A Ferlito
A Iosipescu
A Negro Del
A Oplatek
AA Hsu
AA Kekelidze
AD Friedman
AF Costa
Afshin Teymoortash
Alena Skálová
Alessandra Rinaldo
Alfio Ferlito
Andrés Coca-Pelaz
Antonio Cardesa
AP Calzada
AR Khan
AS Jones
Asterios Triantafyllou
B Cady
BJ Soboroff
BM Abercromby
Bruce M. Wenig
BW Murray
C Messaoudi
Carl E. Silver
Carlos Suárez
CDM Fletcher
CJ Balamucki
CW Gross
D Ide
D Morais Pérez
D Testa
D Veivers
DA Silverman
DG Sessions
Douglas R. Gnepp
DT Donovan
E Kerviler de
E Zvrko
F Eschwege
F Lemaître
F Nhembe
FJ Putney
G Eigler
G Pincini
GB Leonardelli
GB Stillwagon
GL Adams
GL Ellis
H Bourgeois
HA Gaissert
Henrik Hellquist
HL Wang
I Ganly
I Serafini
J Broeckaert
J Cohen
J Damborenea Tajada
J Fordice
J Leroux-Robert
J Olofsson
JA Houle
JA Murtagh
Jatin P. Shah
JE Ash
Jean Anderson Eloy
Jesus E. Medina
JG Batsakis
JH Whicker
JM Boland
JM Dueñas Parrilla
JM Toomey
JN Anderson Jr
JR McDonald
JR Paredes Osado
JT Parsons
Juan P. Rodrigo
Justin A. Bishop
K Donath
K Fleischer
K Mahlstedt
K Muzaffar
K. Thomas Robbins
Karen T. Pitman
Kenneth O. Devaney
KJ Misiukiewicz
KY Lam
L Bignardi
L Pietrantoni
L Rosenfeld
LA Lee
Leon Barnes
Lester D. R. Thompson
Luiz P. Kowalski
LV Ackerman
M Amit
M Amit
M Gerard
M Javadi
M Michal
M Zhang
Marc Hamoir
MC Wang
MH Weiss
Michelle D. Williams
NN Carmel
O Aydin
P Berdal
P Berdal
Patrick J. Bradley
PB Zald
PD Freedman
Pieter J. Slootweg
PM Dubal
PM Scott
Primož Strojan
R Allachy
R Jelínek
R Kramer
R Min
R Srivastava
RC Mankodi
Remco de Bree
RH Spiro
RH Spiro
RH Spiro
RH Spiro
RI Haddad
Robert P. Takes
RP Hogg
RR Seethala
RV Moukarbel
S Ahued
S Alavi
S Lloyd
S Weert van
SE Mills
SL Wain
SP Gadomski
T Gierek
T Nagao
TK Nielsen
TL Tewfik
TS Li
Vincent Vander Poorten
VL Schramm Jr
W Liu
William M. Mendenhall
WL Marsh Jr
X Qian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Adenoid cystic carcinoma (AdCC) of the head and neck is a well-recognized pathologic entity that rarely occurs in the larynx. Although the 5-year locoregional control rates are high, distant metastasis has a tendency to appear more than 5 years post treatment. Because AdCC of the larynx is uncommon, it is difficult to standardize a treatment protocol. One of the controversial points is the decision whether or not to perform an elective neck dissection on these patients. Because there is contradictory information about this issue, we have critically reviewed the literature from 1912 to 2015 on all reported cases of AdCC of the larynx in order to clarify this issue. During the most recent period of our review (1991-2015) with a more exact diagnosis of the tumor histology, 142 cases were observed of AdCC of the larynx, of which 91 patients had data pertaining to lymph node status. Eleven of the 91 patients (12.1%) had nodal metastasis and, based on this low proportion of patients, routine elective neck dissection is therefore not recommended

Lirias

Crossref

Springer - Publisher Connector

Repositorio Institucional de la Universidad de Oviedo

Incorporating contextual information in recommender systems using a multidimensional approach

Author: Adomavicius G.
Aggarwal C. C.
Alexander Tuzhilin
Ansari A.
Basu C.
Billsus D.
Breese J. S.
Caglayan A.
Chien Y.-H.
Claypool M.
Condliff M.
Cortes C.
Delgado J.
Dietterich T. G.
Fan J.
Gediminas Adomavicius
Getoor L.
Han J.
Herlocker J. L.
Hill W.
Im I.
Klein N. M.
Koller D.
Lang K.
Lee W. S.
Lussier D. A.
Mobasher B.
Mooney R. J.
Mooney R. J.
Nakamura A.
Oard D. W.
Pennock D. M.
Ramesh Sankaranarayanan
Resnick P.
Sarwar B.
Sarwar B.
Shahana Sen
Shardanand U.
Soboroff I.
Sparck Jones K.
Tran T.
Ungar L. H.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Humans optional? Automatic large-scale test collections for entity, passage, and entity-passage retrieval

Author: A Kembhavi
B Dalvi
C Boston
C Shah
C Wade
C Xiong
C Xiong
C Xiong
D Bodoff
E Choi
E Gabrilovich
E Yilmaz
EM Voorhees
G Demartini
GK Jayasinghe
GV Cormack
H Bast
H Bast
H Bota
H Raviv
H Zhang
I Soboroff
J Allan
J Dalton
J Dalton
J Foley
J Kamps
J Kamps
J O’Connor
J Pennington
JP Callan
JR Frank
K Balog
L Azzopardi
L Dietz
L Dietz
L Dietz
M Kaszkiel
M Schuhmacher
N Asadi
O Alonso
P Arvola
P Ferragina
PN Mendes
R Berendsen
R Berendsen
R Blanco
R Nogueira
S Arnold
S Chatterjee
S MacAvaney
SM Beitzel
T Sakai
U Sawant
X Wan
Y Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2020

Manually creating test collections is a time-, effort-, and cost-intensive process. This paper describes a fully automatic alternative for deriving large-scale test collections, where no human assessments are needed. The empirical experiments confirm that automatic test collection and manual assessments agree on the best performing systems. The collection includes relevance judgments for both text passages and knowledge base entities. Since test collections with relevance data for both entity and text passages are rare, this approach provides a cost-efficient way for training and evaluating ad hoc passage retrieval, entity retrieval, and entity-aware text retrieval methods

Crossref

Enlighten

Problems with Kendall's Tau

Author: Sanderson M.
Soboroff I.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2007

This poster describes a potential problem with a relatively widely used measure in Information Retrieval research: Kendall's Tau rank correlation coefficient. The coefficient is best known for its use in determining the similarity of test collections when ranking sets of retrieval runs. Threshold values for the coefficient have been defined and used in a number of published studies in information retrieval. However, this poster presents results showing that basing decisions on such thresholds is not as reliable as has been assumed.
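
For readers unfamiliar with the coefficient, the following is a minimal sketch of Kendall's Tau as it is typically applied to two score lists for the same set of retrieval runs. It computes the tau-a variant (tied pairs count as neither concordant nor discordant). The function name `kendall_tau` and the example scores are hypothetical and are not taken from the poster.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists for the same systems."""
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both lists order systems i and j the same way.
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical effectiveness scores for four runs on two test collections.
collection_1 = [0.30, 0.25, 0.20, 0.10]
collection_2 = [0.28, 0.26, 0.15, 0.18]
print(kendall_tau(collection_1, collection_2))
```

In this example five of the six run pairs are ordered the same way on both collections and one pair is swapped, so the coefficient is (5 - 1) / 6.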

CiteSeerX

White Rose Research Online

Problems with Kendall's Tau

Author: Sanderson M
Soboroff I
Publication venue: ACM (New York, USA)
Publication date: 01/01/2007

RMIT Research Repository

Bibliotòpics

Author: Craswell N.
Jelinek F.
Soboroff I.
Publication date: 01/01/2013

This infographic report seeks to check how much truth lies behind librarian stereotypes. The bun, the glasses, the antipathy, the introversion... Are Catalan librarians like this in the 21st century? Do they share any personal traits that have led them to the profession? To answer these questions, more than 500 qualified professionals of working age from Catalonia (Spain) were surveyed. The results are presented in an informal but rigorous way.

International Migration, Integration and Social Cohesion online publications